Caio Raphael

"Ownership determines destruction strategy. The stronger the ownership, the simpler the destruction should be."
The entity always owns the Object. Necessarily the Object should die when the entity dies.

Motivation

Disclaimer

This analysis I'm making below is from a challenge I was facing in my Odin engine.
It's more of a sketch of what was going on in my head at the time. Some things probably don't make much sense outside of their proper context.

Back to the analysis

Annoying places that led me to think about the subject:
- bodies
- layers.
- tweens.
- ~timers
  - if I rework the system.
- world init / world deinit is very easy to forget.
  - The 'struct-based lifetime aggregation' system helped a lot with this.
Post about the subject, focusing on the tween system:
```
eng.tween(
    value = &some_vector_inside_an_entity,
    end = some_vector,
    duration_s = 0.1,
    on_end = proc(tween: ^eng.Tween) {
        // stuff
    },
)
```
- This is how I call a tween. The only things needed for this tween to work are a call to eng.tween_system_init() at the beginning of the game, a call to eng.tween_system_deinit() at the end of it, and a call to eng.tween_system_update(dt) every frame, so the tween_system can process every tween stored by eng.tween(..).
- When a tween finishes, it gets removed automatically by eng.tween_system_update(dt).
- The problem is kind of obvious: if the entity dies, the tween still holds a pointer into the entity, and the next update reads and writes freed memory; UAF.
- I came up with 5 major ideas to solve this problem, but I'm not really happy with them, or I don't know if they are good practice:
  - 1 - The entity stores the tween or a handle to the tween, so when the entity dies, I can manually ask to remove the tween and everything is ok.
  - 2 - The tween doesn't store a pointer to some information in the entity, but a handle to it.
  - 3 - The tween_system doesn't exist. The entity has a tween stored, and MANUALLY calls a new function called tween_process(&the_entity_tween) every frame of the entity update. The tween doesn't need to be destroyed, as it doesn't own the value pointer. The existence of the tween is tightly linked to the existence of the entity. Just stop calling tween_process and you won't have a UAF.
  - 4 - The entity doesn't care about the existence of the tween and can die in peace. The tween knows when the entity died and removes "bad tweens" and never tries to access freed memory. This could be done through:
  - 4.1 - The entity has a "pointer to lifetime handle" inside of it. When calling the tween, it passes this handle as an argument for the tween. Every time an object stores this handle, its internal counter goes up by 1. When the entity dies, it changes the handle state to dead = true and subtracts 1 from the counter. The eng.tween_system_update(dt) checks if the handle is dead or not; if so, it frees the "bad tween" and subtracts 1 from the counter. You get the idea. The counter only serves as a way to analyze the memory on game deinit.
  - 4.2 - The entity has an "event struct". When calling the tween, it passes a pointer to the event so the tween can register a function pointer to be called when the event is emitted; the content of this function would be the destructor for the tween. When the entity dies, it calls something like event_emit(the_entity_destructor_event), and everything that subscribed to it gets destroyed. This kinda reminds me of RAII.
- My thoughts on each strategy:
  - 1, 2 and 3: 3 is clearly the best in terms of safety, but the problem with all 3 is that they make the entity aware of the existence of a tween. The entity needs to know beforehand how many tweens will be used at the same time, or use an array that will need to be destructed. This makes the API look less pleasing. So many systems could use a tween, and it sucks that it wouldn't be just a simple plug-n-play. I wish I could call tween(..) and be done with it; let the tween_system handle the rest.
  - 4.1 and 4.2: They both sound a bit cheesy. The tween SHOULD die as the entity dies, as it has a pointer to it, but this strategy throws the responsibility to a 3rd party (a lifetime handle or event emitter) to manage the problem. This introduces a level of abstraction that I'm not too fond of.
- Finally, if I were to choose, I'd go for 4.2 to clean up the API, or go back to 3 if it ends up not being a good idea.
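
To make strategy 3 concrete, here is a minimal sketch of a tween stored inside the entity and advanced by the entity's own update. Everything in it (the Tween fields, tween_start, entity_update, the linear interpolation) is an assumption made up for illustration, not the engine's actual API; only the tween_process name comes from the notes above.

```odin
package main

import "core:fmt"

// Hypothetical minimal tween: plain data that owns nothing.
Tween :: struct {
    value:      ^f32, // points at a field of the owning entity
    start:      f32,
    end:        f32,
    duration_s: f32,
    elapsed_s:  f32,
    active:     bool,
}

// Hypothetical entity that stores its tween inline.
// Assumption: the entity is not moved in memory while the tween is active,
// otherwise `value` would point at the old location.
Entity :: struct {
    x:       f32,
    x_tween: Tween,
}

tween_start :: proc(tween: ^Tween, value: ^f32, end: f32, duration_s: f32) {
    tween^ = Tween{value = value, start = value^, end = end, duration_s = duration_s, active = true}
}

// Advances the tween by dt seconds and returns true once it has finished.
tween_process :: proc(tween: ^Tween, dt: f32) -> bool {
    if !tween.active do return true
    tween.elapsed_s += dt
    t: f32 = 1
    if tween.elapsed_s < tween.duration_s {
        t = tween.elapsed_s / tween.duration_s
    } else {
        tween.active = false
    }
    tween.value^ = tween.start + (tween.end - tween.start) * t // linear interpolation
    return !tween.active
}

// The entity drives its own tween; when the entity stops updating, so does the tween.
entity_update :: proc(e: ^Entity, dt: f32) {
    tween_process(&e.x_tween, dt)
}

main :: proc() {
    e: Entity
    tween_start(&e.x_tween, &e.x, 10, 1)
    entity_update(&e, 0.5)
    entity_update(&e, 0.5)
    fmt.println(e.x, e.x_tween.active)
}
```

Nothing here needs a destructor: the tween is just a field, so it disappears together with the entity. The cost is the one described above, the entity has to declare every tween slot it might need.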

Ideas

The object is stored internally in the Entity :
1. The Object doesn't have to worry about being destroyed, since no one outside the Entity references it .
  - Access is intrinsically tied to the lifetime.
  - 'Stateless' seems to be a word that defines this strategy.
  - Examples where I use this :
    - Timer system.
      - Need to advance time manually via timer_is_finished.
      - In that case, I would do eng.tween_update(^Tween, dt).
        
        The dt is optional, like in the Timer case.
  - For a Tween :
    1. The entity has a Tween_System
      - Basically to have an array instead of unitary values.
      - Use a single update.
    2. The entity has multiple Tweens
      - Requires that I know the maximum number of concurrent tweens I will have at any moment.
    - The Tween can still have on_enter and on_end functions, no problem.
    - A Tween can be torn down and rebuilt however you want, so when a tween finishes I can create another tween immediately afterwards using the same Tween.
    - The Tween never needs to be destroyed. It doesn't own anything.
2. The Object must be destroyed manually during the Entity's destruction :
  - When the Entity is destroyed, it calls the destruction of the Objects it owns.
  - Problems :
    - Compared to "The Object doesn't have to worry about being destroyed, since no one outside the Entity references it":
      - forgetting to update: no big deal.
      - forgetting to destroy: memory leak and UAF.
      - So this technique is less safe.
  - Examples where I use this :
    - Bodies / Body Wrappers system.
      - This system makes sense, since a Body needs to be destroyed in Jolt anyway, so a destructor call is inevitable.
  - For a Tween :
    1. The entity has a Tween_System
      - Basically to have an array instead of unitary values.
      - Use a single update.
    2. The entity has multiple Tweens
      - Requires that I know the maximum number of concurrent tweens I will have at any moment.
    - eng.tween returns a handle to the tween.
    - This can be annoying for chaining tweens, having to always have a place to store the tween data.
      - In this case, I could implement a system where tween chaining happens without the need for on_end, via use of eng.tween_chain, something like that.
    - The tween update is done automatically via eng.tween_update(dt), called directly by the Game Loop, each desired frame.
    - Considering that a tween destroys itself automatically when finished, it can be weird to call the destruction of something that's already been destroyed.
      - ~~The tween might not destroy itself automatically, but that makes things a bit more annoying, increasing the verbosity of the tween callback to always destroy itself.~~
        
        proc(tween: ^Tween) {
            eng.tween_destroy(tween) // or eng.tween_destroy(tween.value) if using a handle
        }
        
        This changes NOTHING. Explicitly destroying the tween in the callback still requires me to destroy it when destroying the entity. Nothing changes.
3. Use of handles :
  - The tween doesn't store a pointer to some information in the entity, but a handle to the information.
  - This means that the entity must have the handle stored, so when the entity dies, the handle is correctly removed.
  - The tween system then tries to tween the data, but when attempting to get the information inside the handle, it realizes the handle is now dead and deletes the tween.
  - When calling the tween:
    - We need to pass the handle for the data, instead of the data; so no new argument is added.
    - We maybe need to pass the handle_map related to the handle.
      - An alternative is to use a global handle_map.
  - Disadvantages :
    - I have to store a handle for every pointer I want to tween, which means duplicating the data.
      - I would have a value and a pointer to this value, wrapped around a handle.
      - That's really weird.
      - Another option would be to store the data somewhere else and just have the handle around for the entity, but this could be even weirder.
    - A handle_map for rawptr could be an annoyance for generic procedures, such as the tween_system.
  - About the handle map :
    1. The handle_map has the same (or higher) lifetime as the tween system.
      1. The handle_map is implicit, without having to pass it around:
        
        - This means that a generic handle_map would have to be used, storing rawptr.
          - This could be an annoyance for the Tween system, as I wouldn't have many ways of checking if the data for the value and end is compatible with each other.
          - Overall, it's a bit annoying for generics.
        - The handle_map is part of the game.
          - ok.
        - ~~The handle_map is global~~.
          - I don't think anything outside the game would want to tween something.
        - ~~The handle_map is part of the scene~~.
          - When exiting the scene, the handle_map is cleared.
          - If the handle_map used by the tween_system is exclusively the handle_map of the current scene, that means that we can't tween something during a scene change, as this means that the whole tween_system would be cleared on a scene deinit.
      2. ~~The handle_map is explicit, having to pass it around~~:
        
        - This increases the argument count for calling a tween by one, and also makes the tween store the handle_map, besides the handle.
        - The handle_map could be more specific, but this is not necessarily an advantage, as:
          - The data processed by each tween has a different type; could be a f32, Vec2, int, etc.
        - I don't see this strategy having any real advantage.
        - The handle_map is part of the scene.
          - This would make the most sense, as the only reason for having an explicit handle_map is to have many options for handle_maps.
    2. ~~The entity has a handle_map~~.
      - If the entity dies, the handle_map dies as well, crashing the whole system and making the use of the handle pointless.
  - Advantages :
    - The entity doesn't need to store any tween. We can have any amount of tweens, without any problems.
  - Disadvantages :
    - Forgetting to remove the handle will cause a UAF, as the tween system will still be able to reach the freed data through the handle.
4. ~~Game State~~ :
  - "Carrying around a Game_State object which stores all game data so deleting any resource goes through a single place."
  - The Game State stores the tweens and the entities.
  - The entity is destroyed through the Game State: game_state_del_entity(gs: ^Game_State, ent: Entity_Id).
    - Which will also delete all data associated with that entity.
    - It can remove its tween because it has access to gs; the entity needs to have an indicator of which tween belongs to the entity either way.
  - My thinking:
    - "The entity doesn't own anything"
      - Idk if this is true. When the entity dies some other piece of data MUST die as well.
      - It's a different way to think of ownership. The entity doesn't really own the tween, but the tween must die with the entity; so who owns the data? I mean, I can't say it doesn't have an owner, as that would imply that something like the GameState owns the tween, which isn't true as only destroying the data when deinitializing the Game State would cause a UAF.
      - Seems like the entity and tween are tied together, but if you look the other way around, if a tween dies that doesn't imply that an entity should die.
      - The entity clearly has a higher hierarchy than the tween when it comes to ownership, so I can safely say that the tween belongs to the entity.
      - So, with that in mind, what justifies the tween being out of the entity? What good does it bring to the destruction of the tween?
    - Sameness:
      - The entity needs to have a handle for the tween stored or the whole tween itself.
        
        Storing only a handle could be a little more problematic if the tween is not killed once the entity dies; then again, that depends on how the tweens are fetched in game_state_tweens :: proc(gs: Game_State) -> []Tween.
    - Possible advantages:
      - Forgetting to clean up a tween will not cause a UAF, as the entity stored the tween and will take the tween with it when dying.
        
        This could cause a memory leak if the tween allocates memory in heap and we forget to clean it up, but it will not cause a crash.
        
        THO, this is only possible due to the system being stateless.
        
        This is also true for other strategies that store the tween inside the entity.
    - Possible disadvantages:
      - Same disadvantages of other strategies that have to store the tween or a handle to the tween in the entity:
        
        I need to know how many tweens I'll have upfront.
        
        The call for a tween could be problematic. If I call a new tween using the same tween_handle or the same tween, while the tween is being used for some other tweening, then this could fail somehow; I would have to overwrite what I asked the tween to do, or just fail the call altogether and say it couldn't be made; this is terrible as it introduces error handling in a simple system.
      - Being stateless means that, every frame, it is necessary to fetch all tweens from everywhere in the system before finally updating each one of them.
        
        This is a problem when you consider that a tween could be anywhere, not only inside entities.
        
        It could be a mess to look up these tweens.
  - Finally:
    - The entity still needs to store a reference to the tween, while also the entity needs to know upfront how many tweens it will use, etc.
    - Even though your strategy could be used, it falls within the realm of strategies 1 and 3. I'm looking for a way that the entity doesn't care about the tweening. The tween could live in a general global space, being created freely, BUT still having its lifetime tied to the lifetime of a pointer the tween holds.
  - Some people tried to defend the strategy, but I still thought it was garbage and it doesn't deserve to be considered. It doesn't solve anything, keeps the problems of having to store the tweens internally and complicates the whole memory system a lot.
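
For comparison, the handle strategy (idea 3 in the list above) could look roughly like the sketch below. It assumes a generational handle_map that stores rawptr and a tween system that looks the target up on every update; Handle, Slot, Handle_Map, handle_add, handle_remove and handle_get are hypothetical names, and the tween is reduced to a single f32 for brevity.

```odin
package main

import "core:fmt"

// Hypothetical generational handle; the zero value never matches a live slot.
Handle :: struct {
    index:      u32,
    generation: u32,
}

Slot :: struct {
    ptr:        rawptr,
    generation: u32,
    alive:      bool,
}

Handle_Map :: struct {
    slots: [dynamic]Slot,
}

handle_add :: proc(m: ^Handle_Map, ptr: rawptr) -> Handle {
    // Reuse a dead slot if possible; bumping the generation invalidates old handles.
    for i in 0 ..< len(m.slots) {
        slot := &m.slots[i]
        if !slot.alive {
            slot.ptr = ptr
            slot.generation += 1
            slot.alive = true
            return Handle{u32(i), slot.generation}
        }
    }
    append(&m.slots, Slot{ptr, 1, true})
    return Handle{u32(len(m.slots) - 1), 1}
}

handle_get :: proc(m: ^Handle_Map, h: Handle) -> (ptr: rawptr, ok: bool) {
    if int(h.index) >= len(m.slots) do return nil, false
    slot := m.slots[h.index]
    if !slot.alive || slot.generation != h.generation do return nil, false
    return slot.ptr, true
}

// Must be called by the entity when it dies; forgetting this brings the UAF back.
handle_remove :: proc(m: ^Handle_Map, h: Handle) {
    if _, ok := handle_get(m, h); ok {
        m.slots[h.index].alive = false
        m.slots[h.index].ptr = nil
    }
}

Tween :: struct {
    target:     Handle,
    start, end: f32,
    duration_s: f32, // assumed to be greater than zero
    elapsed_s:  f32,
}

Tween_System :: struct {
    tweens: [dynamic]Tween,
}

tween_system_update :: proc(sys: ^Tween_System, m: ^Handle_Map, dt: f32) {
    // Iterate backwards so unordered_remove never skips an element.
    for i := len(sys.tweens) - 1; i >= 0; i -= 1 {
        tw := &sys.tweens[i]
        ptr, ok := handle_get(m, tw.target)
        if !ok {
            unordered_remove(&sys.tweens, i) // target is gone: drop the tween, touch nothing
            continue
        }
        tw.elapsed_s = min(tw.elapsed_s + dt, tw.duration_s)
        value := cast(^f32)ptr
        value^ = tw.start + (tw.end - tw.start) * (tw.elapsed_s / tw.duration_s)
        if tw.elapsed_s >= tw.duration_s do unordered_remove(&sys.tweens, i)
    }
}

main :: proc() {
    m: Handle_Map
    sys: Tween_System
    defer delete(m.slots)
    defer delete(sys.tweens)

    x: f32
    h := handle_add(&m, &x)
    append(&sys.tweens, Tween{target = h, start = 0, end = 10, duration_s = 1})

    tween_system_update(&sys, &m, 0.5)
    handle_remove(&m, h) // the "entity" dies here
    tween_system_update(&sys, &m, 0.5)
    fmt.println(x, len(sys.tweens))
}
```

The rawptr cast inside tween_system_update is exactly the generics annoyance mentioned in the handle_map notes: the system has to trust that the handle really points at an f32.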
~~The Object is stored in a global system~~ :
- Problems :
  - ANY pointer is a problematic pointer, since the lifetime of the pointed-to data will always be shorter than the lifetime of the tween system. This requires extra ways to destroy the tween before the dangling pointer stored inside it is used.
1. Use of Events to destroy Objects :
  - The main characteristic of this system is that the destructor can be defined right when creating the Object, so you know when it will be destroyed; that is, when the event is emitted.
  - For tweens :
    - Why I stopped using this for tweens :
      - I had big lifetime problems across the whole game in the past, but after removing global variables this problem was greatly reduced, diminishing the arguments in favor of using events.
      - An internal system greatly reduces the complexity of the problem, making it much easier to read and understand what's happening.
      - The existence of a destructor for a tween is strange. The tween destroys itself on completion, so the "destructor" is never called in 99% of cases. In the vast majority of cases, event listeners were registered for the destructor and unregistered without ever being used.
        
        The destructor was just an anti-crash system; its purpose was only to handle an exception where tweens still existed while the pointer of their custom_data no longer existed.
    - After doing :
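```
// Call site: tween() also receives the entity's destructor event.
eng.tween(
    destructor = &personagem.destructor, // event emitted when the entity is destroyed
    value = &personagem.arm3.pos_world,
    end = arm_relative_target_trans_arm3.pos,
    duration_s = 0.1,
    custom_data = personagem,
    on_end = proc(tween: ^eng.Tween) {
        // custom_data carries the entity, so the callback can reach it again.
        personagem := cast(^Personagem_User)tween.custom_data
        personagem.arm3.is_stepping = false
    },
)

// Internal to tween(): register tween_delete as a listener on the destructor event,
// so the tween is removed if the entity dies before the tween finishes.
ok: bool
tween.destructor_handle, ok = event_add_listener(destructor, wrap_procedure(tween_delete), rawptr(value))
if !ok {
    log.errorf("%v(%v): Failed to add event listener.", loc.procedure, loc.line)
    return nil
}
```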
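
The engine's event code isn't shown here, so below is a stripped-down sketch of how such a destructor event could work end to end. Listener, Event, event_listen and entity_destroy are hypothetical stand-ins (the real event_add_listener also returns a handle and an ok flag), and the tween is reduced to the two fields that matter for the lifetime problem.

```odin
package main

import "core:fmt"

// Hypothetical listener: a callback plus the data it should act on.
Listener :: struct {
    callback: proc(data: rawptr),
    data:     rawptr,
}

Event :: struct {
    listeners: [dynamic]Listener,
}

event_listen :: proc(e: ^Event, callback: proc(data: rawptr), data: rawptr) {
    append(&e.listeners, Listener{callback, data})
}

// Calls every listener once, then forgets them.
event_emit :: proc(e: ^Event) {
    for l in e.listeners {
        l.callback(l.data)
    }
    clear(&e.listeners)
}

Tween :: struct {
    value: ^f32,
    alive: bool,
}

// Destructor registered at creation time: marks the tween dead so the
// tween system never dereferences `value` again.
tween_delete :: proc(data: rawptr) {
    tween := cast(^Tween)data
    tween.alive = false
    tween.value = nil
}

Entity :: struct {
    pos:        f32,
    destructor: Event,
}

entity_destroy :: proc(e: ^Entity) {
    event_emit(&e.destructor) // everything tied to this entity dies here
    delete(e.destructor.listeners)
}

main :: proc() {
    e: Entity
    tw := Tween{value = &e.pos, alive = true}
    event_listen(&e.destructor, tween_delete, &tw)

    entity_destroy(&e)
    fmt.println(tw.alive)
}
```

This sketch leaves out the unregistering step: a tween that finishes on its own would have to remove its listener first, which is the bookkeeping that made the destructor feel like pure overhead.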
2. Use of an external Lifetimes Manager system :
  - Use a Lifetime_Handle as an indicator if the entity was destroyed.
  - Each of the individual object systems checks that Lifetime_Handle and will destroy the Object if it notices that the handle was marked as dead.
  - The check must be done in a loop; normally in the object's system loop.
  - This means it is intrinsically a polling system.
  - Problems :
    - If the polling is not done, then the Object will never be destroyed.
      - This is a problem that happens every time the game closes: the Lifetime_Handle is marked as dead, but no final polling is done to destroy the objects.
      - This can cause memory leaks.
        
        The Lifetime_Handles system could, in principle, clean up the memory of each registered thing itself, but it doesn't have enough information about how the destruction of each element should be performed. Besides, of course, that would be a pain to implement.
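
A rough sketch of the polling idea, assuming a heap-allocated Lifetime_Handle with a dead flag and a reference counter as described in 4.1. The procedure names (lifetime_acquire, lifetime_release, entity_make, entity_destroy) and the simplified tween that just writes its end value are assumptions for illustration only.

```odin
package main

import "core:fmt"

// Hypothetical lifetime handle: allocated separately so it can outlive the entity.
Lifetime_Handle :: struct {
    dead:     bool,
    refcount: int,
}

lifetime_acquire :: proc(h: ^Lifetime_Handle) -> ^Lifetime_Handle {
    h.refcount += 1
    return h
}

// The last holder to let go frees the handle.
lifetime_release :: proc(h: ^Lifetime_Handle) {
    h.refcount -= 1
    if h.refcount == 0 do free(h)
}

Entity :: struct {
    pos:      f32,
    lifetime: ^Lifetime_Handle,
}

entity_make :: proc() -> ^Entity {
    e := new(Entity)
    h := new(Lifetime_Handle)
    e.lifetime = lifetime_acquire(h)
    return e
}

// The entity only marks the handle as dead; it knows nothing about tweens.
entity_destroy :: proc(e: ^Entity) {
    e.lifetime.dead = true
    lifetime_release(e.lifetime)
    free(e)
}

Tween :: struct {
    lifetime: ^Lifetime_Handle,
    value:    ^f32,
    end:      f32,
}

Tween_System :: struct {
    tweens: [dynamic]Tween,
}

tween :: proc(sys: ^Tween_System, lifetime: ^Lifetime_Handle, value: ^f32, end: f32) {
    append(&sys.tweens, Tween{lifetime_acquire(lifetime), value, end})
}

// Polling: every update checks the handle before touching the pointer.
tween_system_update :: proc(sys: ^Tween_System) {
    for i := len(sys.tweens) - 1; i >= 0; i -= 1 {
        tw := sys.tweens[i]
        if tw.lifetime.dead {
            lifetime_release(tw.lifetime)
            unordered_remove(&sys.tweens, i)
            continue
        }
        tw.value^ = tw.end // stand-in for the real interpolation
    }
}

main :: proc() {
    sys: Tween_System
    defer delete(sys.tweens)

    e := entity_make()
    tween(&sys, e.lifetime, &e.pos, 10)
    tween_system_update(&sys) // entity alive: the tween writes to e.pos

    entity_destroy(e)
    tween_system_update(&sys) // entity dead: the tween is dropped without touching e.pos
    fmt.println(len(sys.tweens))
}
```

The second tween_system_update call is what actually releases the tween and the handle. If the game closes right after entity_destroy and that call never happens, both leak, which is the shutdown problem described above.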

Improvements Applied

Stage 1: Clarity in the definition of lifetimes

"Grouped element thinking and systems (n+1)".
"struct-based lifetime aggregation".
- It was the solution I used.
Ideally I don't want to have any data stored in the global scope, except if .
All data should belong to a struct that represents its lifetime.
What does this help with?
- It gets closer to n+1
- Helps not to forget about initializations and deinitializations.
- Makes some lifetimes explicit, making it harder to make UAF mistakes.
- Makes it much easier to use arena allocators for those objects with the same lifetime.
- I think it also makes explicit what has a 1-frame lifetime, for use in the temp allocator or another custom arena allocator.
- This can be very useful when dealing with a low-level render API.
- Overall: it seems to be an improvement in code clarity and quality.
Does this help the tween/external systems in any way?
- I don't think so.
- The storage of the data is done using another allocator, regardless.
- The data stored in the system cannot use the same allocator as the entity, since the system should not die when the entity dies.
Post-change impressions :
- The system helps to visualize improper access problems, that's all.
- In a nutshell, it's a way to stay aware of grouped lifetimes, it's "grouped thinking".
- Therefore it's ok, even if it might not be the best way to visualize/handle the problem.
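
As an illustration of "all data should belong to a struct that represents its lifetime", here is a small sketch of a scene struct backed by an arena from core:mem. The Scene and Entity types, the 64 KiB arena size and the capacity of 64 entities are arbitrary assumptions; the point is only that one init and one deinit cover everything with the scene's lifetime.

```odin
package main

import "core:fmt"
import "core:mem"

Entity :: struct {
    pos: [2]f32,
}

// Hypothetical scene: everything that lives exactly as long as the scene
// is either a field of this struct or allocated from its arena.
// Assumption: the Scene is not moved after scene_init, because the arena
// allocator keeps a pointer to the `arena` field.
Scene :: struct {
    backing:  []byte,
    arena:    mem.Arena,
    entities: [dynamic]Entity,
}

scene_init :: proc(s: ^Scene) {
    s.backing = make([]byte, 64 * mem.Kilobyte)
    mem.arena_init(&s.arena, s.backing)
    // Reserve the capacity up front so the array never has to grow inside the arena.
    s.entities = make([dynamic]Entity, 0, 64, mem.arena_allocator(&s.arena))
}

// Freeing the backing memory releases every arena allocation at once.
scene_deinit :: proc(s: ^Scene) {
    delete(s.backing)
    s^ = {}
}

main :: proc() {
    scene: Scene
    scene_init(&scene)
    append(&scene.entities, Entity{pos = {1, 2}})
    fmt.println(len(scene.entities))
    scene_deinit(&scene) // one call ends everything with the scene's lifetime
}
```

A tween system that must outlive individual entities could not allocate from an entity-level arena like this, which matches the conclusion above that the grouping doesn't help the external systems.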

Stage 2: Purity of functions, via removal of global variables and global structs

The answer about lifetimes lies in the absence of global variables or global structs.
Why I came up with this idea :
- When the lifetime of something ends, I wanted it to be impossible to access it.
- This reminded me of Scopes, which do exactly that.
Notable improvements :
- Multithreading is safer.
  - The net_connections thread doesn't have access to the game thread's content, since the game is on the worker_game stack.
  - This is the only big advantage in the separation between global and game.
    - There's also a gain in function purity, of course.
    - If the game didn't have one dedicated network thread, then game would be basically a global .
    - Even if game is a global variable (like in the crypted core) I would still pass the variable as a function parameter (for sure), for purity, helping to clarify what the function uses.
- I liked the solution for RPCs and Jobs, surprisingly.
  - The only thing done is passing the game as an extra arg in the context, via context.user_ptr; that's it.
  - Simple, easy to understand, without requiring changes to the RPC or Job code.
    - If it weren't for that, I would have to pass the game as a procedure arg, having to modify the RPC and Job code accordingly.
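
A minimal sketch of the context.user_ptr trick, assuming a job with a fixed signature that has no room for a game parameter. Game, Job_Proc, job_add_score and run_job are hypothetical names; context.user_ptr itself is a field of Odin's implicit context.

```odin
package main

import "core:fmt"

Game :: struct {
    score: int,
}

// Hypothetical job signature that cannot be changed to take the game as an argument.
Job_Proc :: #type proc()

// The job recovers the game from the implicit context instead of a parameter.
job_add_score :: proc() {
    game := cast(^Game)context.user_ptr
    game.score += 1
}

// The job runner knows nothing about Game; it just forwards the context it was given.
run_job :: proc(job: Job_Proc) {
    job()
}

main :: proc() {
    game: Game // lives on this stack frame, not in global scope
    context.user_ptr = &game
    run_job(job_add_score)
    fmt.println(game.score)
}
```

The cast in job_add_score is unchecked, so this relies on the convention that user_ptr always holds the game while jobs are running.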